- Dan, you say
-
- <<
- I suppose you could come up with a DTD that describes something
- close to the current HTML, but I'm not sure of the value of it.
- HTML allows tags to be pretty much sprinkled wherever you feel
- like putting them. Any DTD that allows that much leeway just
- looks like this:
-
- <!ENTITY % alltags "TITLE|H1|H2|H3|MENU|OL|UL">
- <!ELEMENT %alltags (%alltags)*>
-
- i.e. every element is just a repeatable or-group of all the elements.
- Then the SGML parser can't do any minimization cuz nothing's required. >>
-
- Yes, current SGML currently is just a linear sequence of
- elements. (Sorry, current HTML -- I'm typing this in serially
- and can't edit!). There is a reason for this: it is very
- convenient for HTML to map onto a series of styles -- for two
- reasons.
-
- Firstly, a lot of rich text objects can hold styles but can't hold
- structure. You can deduce structure from the styles -- like
- Word deducing outlining from Heading styles, and WWW deducing
- a list <UL> from a lot of <LI> paragraphs. But you can't go
- very far. If you want to make a HT editor out of such a
- text object, you have to regenerate the elements from the
- styles.
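-
- As a rough illustration of regenerating elements from styles, here
- is a minimal Python sketch. It assumes the text object exposes its
- content as a flat sequence of (style, text) paragraphs; the function
- name and data layout are hypothetical, not taken from any existing
- editor.
-
```python
from itertools import groupby


def regenerate_lists(paragraphs):
    """Group each run of consecutive LI-styled paragraphs into one UL.

    `paragraphs` is a flat sequence of (style, text) pairs, the way a
    style-only rich text object might store them (an assumed layout).
    """
    structured = []
    for style, run in groupby(paragraphs, key=lambda p: p[0]):
        texts = [text for _, text in run]
        if style == "LI":
            # A run of list-item paragraphs implies one enclosing list.
            structured.append(("UL", [("LI", t) for t in texts]))
        else:
            # Other styles map one-to-one onto elements.
            structured.extend((style, t) for t in texts)
    return structured


flat = [
    ("H1", "Fruit"),
    ("LI", "Apples"),
    ("LI", "Pears"),
    ("P", "That is all."),
]
nested = regenerate_lists(flat)
```
-
- This only recovers one level of list; deeper nesting cannot be
- deduced from the styles alone, which is the "can't go very far" limit.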
-
- Secondly, it may be that the wysiwyg editors have a linear style
- structure because that is intuitive to people. I don't know
- a lot of people who use author/editor (which maintains
- structure). Maybe real people actually think in terms of styles
- and fix the document to look right, then they are happy to have the
- structure deduced.
-
- So if we went for a nestable HTML which would be cleaner for
- those who appreciate recursion, we would have to have a hypertext
- editor which made the structure visible. I don't have experience
- enough to know whether real information providers (group secretaries,
- for example) would be into generating nested elements -- maybe
- the styles are useful to keep as the current `user interface metaphor'
- of word processors.
-
- (It also makes making the editor easier!)
-
- Or maybe we should have two levels of DTD -- one basically linear
- and mandatory (and precompiled for fast access) and one more
- sophisticated for larger documents.
-
- Of course, when you are writing hypertext the large documents are
- normally broken down into small bits to make traversing them quick.
- So whereas each hypertext node may contain only H1 and H2 headings,
- when a book is generated a la the_www_book.ps you get 5 levels
- of heading from the whole tree.
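-
- One way to picture how shallow per-node headings turn into deeper
- book headings is the Python sketch below. It assumes each node is a
- dict with "headings" and "children" keys and that a heading's book
- level is its local level plus the node's depth in the tree; both are
- illustrative assumptions, not a description of how the_www_book.ps
- is actually produced.
-
```python
def book_headings(node, depth=0):
    """Yield (book_level, title) pairs for a tree of hypertext nodes.

    Each node carries only local H1/H2 headings; the level used in the
    generated book is the local level offset by the node's depth.
    """
    for local_level, title in node["headings"]:
        yield (depth + local_level, title)
    for child in node.get("children", []):
        yield from book_headings(child, depth + 1)


tree = {
    "headings": [(1, "WWW"), (2, "Overview")],
    "children": [
        {"headings": [(1, "HTML"), (2, "Tags")], "children": []},
    ],
}
outline = list(book_headings(tree))
```
-
- Under this assumption, a node three links below the root with an H2
- heading would yield a level-5 heading in the book.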
-
- So that is why the HTML structure is so simple. I am open to
- a more sophisticated alternative.
-
- Tim
- ____________________________
- From connolly@pixel.convex.com Fri Jun 26 00:00:33 1992
- Return-Path: <connolly@pixel.convex.com>
- Received: from dxmint.cern.ch by nxoc01.cern.ch (NeXT-1.0 (From Sendmail 5.52)/NeXT-2.0)
- id AA02722; Fri, 26 Jun 92 00:00:27 MET DST
- Received: by dxmint.cern.ch (dxcern) (5.57/3.14)
- id AA25540; Fri, 26 Jun 92 00:00:11 +0200
- Received: from pixel.convex.com by convex.convex.com (5.64/1.35)
- id AA10700; Thu, 25 Jun 92 17:00:01 -0500
- Received: from localhost by pixel.convex.com (5.64/1.28)
- id AA05209; Thu, 25 Jun 92 17:00:00 -0500
- Message-Id: <9206252200.AA05209@pixel.convex.com>
- To: timbl@nxoc01.cern.ch (Tim Berners-Lee)
- Subject: Re: HTML DTD
- In-Reply-To: Your message of "Thu, 25 Jun 92 23:07:25 +0700."
- <9206252107.AA02534@nxoc01.cern.ch>
- Date: Thu, 25 Jun 92 16:59:59 CDT
- From: Dan Connolly <connolly@pixel.convex.com>
- Status: R
-
-
- >thanks for that contribution. Not being as hot on SGML
- >as I ought to be, I don't see why the HREF has to refer to
- >an entity declared separately rather than directly having
- >a string argument.
- >
- That's actually left over from when I was trying to point
- HREF attributes to MIME attachments. It's not really
- necessary to move the UDIs into entities as long as you're
- careful that the UDI syntax is a subset of the SGML
- attribute literal syntax.
-
- Beware, for example, that an
- SGML parser will expand entity references in an attribute literal
- to produce the CDATA for the attribute value. So that
- <A HREF="A&P"> might be OK for the linemode browser,
- but an SGML parser will try to resolve &P.
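-
- A writer that generates such attributes could sidestep the problem
- by escaping the ampersand before emitting the literal. The Python
- sketch below is one way to do it; the function name is hypothetical,
- and it assumes the DTD in use declares the `amp` and `quot` entities
- that the escaped form relies on.
-
```python
from html import escape


def href_attribute(address):
    """Return an HREF attribute whose value is safe inside a quoted literal."""
    # escape() rewrites & as &amp; and, with quote=True, " as &quot;,
    # so a parser expanding entity references recovers the original string.
    return 'HREF="%s"' % escape(address, quote=True)


attr = href_attribute("A&P")
```
-
- With this, the anchor would be written <A HREF="A&amp;P"> and the
- parser would resolve the reference back to A&P.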
-
- Also, SGML attribute values have a maximum length specified
- in the SGML declaration. The default value is 960 or something
- around there.
-
- >The title is in fact optional currently, by the way ...
- >we could keep it so though it "ought" always to have one.
- >
- >I'd like a DTD which as closely reflects the current HTML as
- >possible.
-
- I suppose you could come up with a DTD that describes something
- close to the current HTML, but I'm not sure of the value of it.
- HTML allows tags to be pretty much sprinkled wherever you feel
- like putting them. Any DTD that allows that much leeway just
- looks like this:
-
- <!ENTITY % alltags "TITLE|H1|H2|H3|MENU|OL|UL">
- <!ELEMENT %alltags (%alltags)*>
-
- i.e. every element is just a repeatable or-group of all the elements.
- Then the SGML parser can't do any minimization cuz nothing's required.
-
- > Then, if we change HTML to HTML2, I would
- >change it in a number of ways, in particular to include
- >separate header and body parts. I have come across the
- >technical documentation. They include Steve Newcombe who
- >is the HyTime guy (or one of the two I should say).
- >I would like to get some input from them.
- >
-
- Certainly we should keep tabs on things like the Davenport
- group and HyTime.
-
- But my immediate concern is these little syntactic differences
- that render HTML documents worthless to an SGML parser. The
- current HTML and UDI syntax make a good proof of concept, but
- we need to move toward formal definitions so that we can
- have confidence that correct implementations will interoperate.
-
- More later...
-
- Dan
-
-
-